The estimation of the Levy density, the infinite-dimensional parameter controlling the jump
dynamics of a Levy process, is considered here under a discrete-sampling scheme. In this setting,
the jumps are latent variables, the statistical properties of which can be assessed when the
frequency and time horizon of observations increase to infinity at suitable rates. Nonparametric
estimators for the Levy density based on Grenander's method of sieves were proposed in
Figueroa-Lopez [IMS Lecture Notes 57 (2009) 117-146]. In this paper, central limit theorems for these
sieve estimators, both pointwise and uniform on an interval away from the origin, are obtained,
leading to pointwise confidence intervals and bands for the Levy density. In the pointwise case,
our estimators converge to the Levy density at a rate that is arbitrarily close to the rate of the
minimax risk of estimation on smooth Levy densities. In the case of uniform bands and discrete
regular sampling, our results are consistent with the case of density estimation, achieving a
rate of order arbitrarily close to $\log^{-1/2}(n)\cdot n^{-1/3}$, where $n$ is the number of
observations. The convergence rates are valid, provided that $s$ is smooth enough and that the time
horizon $T_n$ and the dimension of the sieve are appropriately chosen in terms of $n$.

Keywords: confidence bands; confidence intervals; Levy processes; nonparametric estimation; 
sieve estimators 

1. Introduction 

1.1. Motivation and preliminary background 

In the past decade, Levy processes have received a great deal of attention, fueled by
numerous applications in the area of mathematical finance, to the extent that Levy
processes have become a fundamental building block in the modeling of asset prices with
jumps (see, e.g., [9] and [13] for further information about this field). The simplest of
these models postulates that the price of a commodity (say a stock) at time $t$ is given as
an exponential function of a Levy process $X := \{X_t\}_{t\ge 0}$. Even this simple extension of
the classical Black-Scholes model, in which $X$ is simply a Brownian motion with drift,
is able to account for several fundamental empirical features commonly observed in time
series of asset returns, such as heavy tails, high kurtosis and asymmetry. Levy processes,
as models capturing some of the most important features of returns and as "first-order
approximations" to other more accurate models, are fundamental for developing and
testing successful statistical methodologies. However, even in such parsimonious models,
there are several issues concerning the performance of statistical inference by standard
likelihood-based methods.

A Levy process is the "discontinuous sibling" of a Brownian motion. Concretely,
$X = \{X_t\}_{t\ge 0}$ is a Levy process if $X$ has independent and stationary increments, its paths are
right-continuous with left limits and it has no fixed jump times. The latter condition means
that, for any $t>0$, $\mathbb{P}[\Delta X_t \neq 0] = 0$, where
$\Delta X_t := X_t - \lim_{s\nearrow t} X_s$ is the magnitude of
the "jump" of $X$ at time $t$. Any Levy process can be constructed from the superposition
of a Brownian motion with drift, $\sigma W_t + bt$, a compound Poisson process and the limit
process resulting from making the jump intensity of a compensated compound Poisson
process, $Y_t - \mathbb{E}Y_t$, go to infinity while simultaneously allowing jumps of smaller sizes.
Formally, $X$ admits a decomposition of the form

$$X_t = bt + \sigma W_t + \lim_{\varepsilon\searrow 0}\int_0^t\int_{\varepsilon<|x|\le 1} x\,(\mu-\bar\mu)(dx,ds) + \int_0^t\int_{|x|>1} x\,\mu(dx,ds),$$ (1.1)

where $W$ is a standard Brownian motion and $\mu$ is an independent Poisson measure on
$\mathbb{R}_+\times\mathbb{R}\setminus\{0\}$ with mean measure $\bar\mu(dx,dt) := \nu(dx)\,dt$. Thus, Levy processes are
determined by three parameters: a nonnegative real $\sigma^2$, a real $b$ and a measure $\nu$ on
$\mathbb{R}\setminus\{0\}$ such that $\int (x^2\wedge 1)\,\nu(dx) < \infty$. The measure $\nu$ controls the jump
dynamics of the process $X$, in that $\nu(A)$ gives the average number of jumps (per unit time) whose
magnitudes fall in a given set $A\in\mathcal{B}(\mathbb{R})$. A common assumption in Levy-based financial
models is that $\nu$ is determined by a function $s:\mathbb{R}\setminus\{0\}\to[0,\infty)$, called the
Levy density, as follows:

$$\nu(A) = \int_A s(x)\,dx, \qquad A\in\mathcal{B}(\mathbb{R}\setminus\{0\}).$$

Intuitively, the value of $s$ at $x_0$ provides information on the frequency of jumps with sizes
"close" to $x_0$.

1.2. The statistical problem and methodology 

We are interested in estimating, in a nonparametric fashion, the Levy density $s$ over a
window of estimation $D := [a,b]\subset\mathbb{R}\setminus\{0\}$, based on discrete observations of the
process on a finite interval $[0,T]$. In general, $s$ can blow up around the origin and, hence, we
consider only domains $D$ that are "separated" from the origin, in the sense that
$D\cap(-\varepsilon,\varepsilon)=\emptyset$ for some $\varepsilon>0$. If the whole path of the process were
available (and, hence, the jumps of the process were observable), the problem would be identical to
the estimation of the intensity of a nonhomogeneous Poisson process on a fixed time interval, say
$[0,1]$, based on $\lfloor T\rfloor$ independent copies of the process. Unfortunately, under discrete
sampling, the times and magnitudes of jumps are latent (unobservable) variables. Nevertheless, it
is expected that the statistical properties of the jumps can be inferred when the frequency
and time horizon of observations increase to infinity, which is precisely the sampling
scheme we adopt in this paper.

Nonparametric estimators for the Levy density were proposed in [14], under continu- 
ous sampling of the process, and in [11], under discrete sampling, using the method of 
sieves. The method of sieves was originally proposed by Grenander [17] and has been 
applied more recently by Birge, Massart and others (see, e.g., [1, 4]) to several classi- 
cal nonparametric problems, such as density estimation and regression. This approach 
consists of the following general steps. First, choose a family of finite-dimensional linear
models of functions, called sieves, with good approximation properties. Common sieves
are splines, trigonometric polynomials and wavelets. Second, specify a "distance" metric
$d$ between functions, relative to which the best approximation of $s$ in a given linear
model $\mathcal{S}$ will be characterized. That is, the best approximation $s^\perp$ of $s$ on
$\mathcal{S}$ is given by $d(s,s^\perp) = \inf_{p\in\mathcal{S}} d(s,p)$. Finally, devise an estimator
$\hat s$, called the projection estimator, for the best approximation of $s$ in $\mathcal{S}$.

The sieves considered here are of the general form

$$\mathcal{S} := \{\beta_1\varphi_1 + \cdots + \beta_d\varphi_d : \beta_1,\ldots,\beta_d\in\mathbb{R}\},$$ (1.2)

where $\varphi_1,\ldots,\varphi_d$ are orthonormal functions with respect to the inner product
$\langle p,q\rangle_D := \int_D p(x)q(x)\,dx$. In the sequel, $\|\cdot\| := \langle\cdot,\cdot\rangle_D^{1/2}$
stands for the associated norm on $L^2(D,dx)$. We recall that, relative to the distance induced by
$\|\cdot\|$, the element of $\mathcal{S}$ closest to $s$, that is, the orthogonal projection of $s$ on
$\mathcal{S}$, is given by

$$s^\perp(x) := \sum_{j=1}^{d}\beta(\varphi_j)\varphi_j(x),$$ (1.3)

where $\beta(\varphi_j) := \langle\varphi_j,s\rangle_D = \int_D\varphi_j(x)s(x)\,dx$. Thus, under this
setting, the method of sieves reduces to the estimation of the functional

$$\beta(\varphi) := \int_D\varphi(x)s(x)\,dx$$

for certain functions $\varphi$. In Section 3, we propose estimators for $\beta(\varphi)$ and, as a
by-product, we develop projection estimators $\hat s$ on $\mathcal{S}$.

Following [11], we further specialize our approach and take regular piecewise polynomials
as sieves, although similar results will hold true if we take other typical classes of
sieves, such as smooth splines, trigonometric polynomials or wavelets. For future reference,
let us formally define the sieves.

Definition 1.1. $\mathcal{S}_{k,m}$ stands for the class of functions $\varphi$ such that for each
$i = 1,\ldots,m$, there exists a polynomial $q_{i,k}$ of degree at most $k$ such that
$\varphi(x) = q_{i,k}(x)$ for all $x$ in $(x_{i-1},x_i]$, where $x_i = a + i(b-a)/m$.

It is easy to build an orthonormal basis for $\mathcal{S}_{k,m}$ using the Legendre
polynomials $\{Q_j\}_{j\ge 0}$ on $L^2([-1,1],dx)$. Indeed, the functions

$$\varphi_{i,j}(x) := \sqrt{\frac{2j+1}{x_i-x_{i-1}}}\;Q_j\!\left(\frac{2x-(x_i+x_{i-1})}{x_i-x_{i-1}}\right)\mathbf{1}_{(x_{i-1},x_i]}(x),$$ (1.4)

with $i = 1,\ldots,m$ and $j = 0,\ldots,k$, form an orthonormal basis for $\mathcal{S}_{k,m}$. For future
reference, let us recall that

$$|Q_j(x)|\le 1 \quad\text{and}\quad |Q_j'(x)|\le Q_j'(1) = \frac{j(j+1)}{2}.$$ (1.5)
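The orthonormality of a rescaled Legendre basis on one cell of the partition can be checked numerically. The following is a minimal sketch, assuming the usual rescaling $\sqrt{(2j+1)/w}\,Q_j(u)$ with $u$ the affine map of the cell onto $[-1,1]$ and $w$ the cell width; the function names `basis_function` and `gram_matrix` are illustrative and do not come from the paper.

```python
import numpy as np
from numpy.polynomial import legendre as L


def basis_function(j, lo, hi):
    """Rescaled Legendre polynomial of degree j supported on the cell (lo, hi]."""
    width = hi - lo
    coeffs = np.zeros(j + 1)
    coeffs[j] = 1.0  # selects the standard Legendre polynomial of degree j

    def phi(x):
        x = np.asarray(x, dtype=float)
        u = (2.0 * x - (hi + lo)) / width  # affine map of the cell onto [-1, 1]
        inside = (x > lo) & (x <= hi)
        return np.where(inside, np.sqrt((2 * j + 1) / width) * L.legval(u, coeffs), 0.0)

    return phi


def gram_matrix(k, lo, hi, n_nodes=20):
    """Gram matrix of the degree-0..k basis functions, by Gauss-Legendre quadrature."""
    u, w = L.leggauss(n_nodes)  # nodes and weights on [-1, 1]
    x = 0.5 * (hi + lo) + 0.5 * (hi - lo) * u
    vals = np.array([basis_function(j, lo, hi)(x) for j in range(k + 1)])
    return (vals * w) @ vals.T * 0.5 * (hi - lo)
```

For instance, `gram_matrix(3, 0.1, 0.5)` should be numerically indistinguishable from the 4 x 4 identity matrix, since the quadrature is exact for the polynomial degrees involved.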

We now review a few points of [11] in order to motivate the results in this paper. It is
proved in [11] that by appropriately choosing the number of classes $m$ and the sampling
frequency high enough (both choices determined as a function of the time horizon $T$), the
resulting projection estimator on $\mathcal{S}_{k,m}$ attains the same rate of convergence in $T$ as the
minimax risk on a certain class $\Theta$ of smooth functions. Specifically, the referred minimax
risk, defined by

$$\inf_{\hat s_T}\sup_{s\in\Theta}\mathbb{E}_s\left[\int_a^b(\hat s_T(x)-s(x))^2\,dx\right],$$ (1.6)

where the infimum is over all estimators $\hat s_T$ based on $\{X_t\}_{t\le T}$, converges to 0 at a
rate $O(T^{-2\alpha/(2\alpha+1)})$ as $T\to\infty$ (see [11], Theorem 4.2). The parameter $\alpha$ characterizes
the smoothness of the Levy densities $s\in\Theta$ on the interval $[a,b]$, in that if $s$ is $r$-times
differentiable on $(a,b)$ ($r = 0,1,\ldots$) and

$$|s^{(r)}(x)-s^{(r)}(y)|\le L|x-y|^{\kappa}$$ (1.7)

for all $x,y\in(a,b)$ and some $L<\infty$ and $\kappa\in(0,1]$, then the smoothness parameter of $s$
is $\alpha := r+\kappa$. In [11], Proposition 3.5, we show that there exists a critical mesh $\delta_T>0$
such that if the time span between consecutive sampling observations is at most $\delta_T$ and
$m_T := [T^{1/(2\alpha+1)}]$, then the resulting projection estimator, denoted by $\hat s_T$, is such that

$$\limsup_{T\to\infty}T^{2\alpha/(2\alpha+1)}\sup_{s\in\Theta}\mathbb{E}\|s-\hat s_T\|^2<\infty.$$ (1.8)

Of course, an "explicit" estimate of $\delta_T$ is necessary for practical reasons. In Section 2,
we show that it is sufficient that $\delta_T = O(T^{-1})$, improving a former result in [11] (see
Proposition 3.7 therein).

Note that the convergence in (1.8) is in the integrated mean square sense. A natural
question, one which we consider in this paper, is whether or not projection estimators
$\hat s_T$ on $\mathcal{S}_{k,m}$ can be devised such that

$$T^{\alpha/(2\alpha+1)}(\hat s_T(x)-s(x))\xrightarrow{\mathcal{D}}\sigma(x)Z$$ (1.9)

holds for a standard normal random variable $Z$, for each fixed $x\in D$. We were unable
to obtain (1.9) due to the fact that the bias of the estimator $\hat s_T$, namely
$\mathbb{E}\hat s_T(x)-s(x)$, is just $O(T^{-\alpha/(2\alpha+1)})$. However, for any
$\beta<\frac{\alpha}{2\alpha+1}$, we can devise a projection estimator $\hat s_T$ such that

$$T^{\beta}(\hat s_T(x)-s(x))\xrightarrow{\mathcal{D}}\sigma(x)Z.$$ (1.10)

The idea is to use "undersmoothing" to make the effect of bias negligible. Our results are
in keeping with those obtained in other standard nonparametric problems, such as density
estimation and functional regression, using local nonparametric methods such as kernel
estimation (see, e.g., [18]). We were unable to find a reference where undersmoothing is
used in a global nonparametric method such as the sieves method and, hence, this could
be an additional contribution of the results presented here.

An important extension of the pointwise central limit theorems is the development
of global measures of deviation or asymptotic confidence bands for the Levy density.
In this paper, we establish these methods for piecewise constant and piecewise linear
regular polynomials (although we believe the result holds true for a general degree),
following ideas of the seminal work of Bickel and Rosenblatt [3]. There are some important
differences, however, starting from the fact that Bickel and Rosenblatt considered kernel
estimators for probability densities, while, here, we consider a global nonparametric
method. In spite of these differences, our results are consistent with the case of density
estimation, achieving a convergence rate of order arbitrarily close to
$\log^{-1/2}(n)\cdot n^{-1/3}$, where $n$ is the number of observations. Again, the rate is valid
provided that the time horizon $T_n$ and the dimension of the sieves are appropriately chosen.

The paper is structured as follows. In Section 2, we derive a short-term ergodic property
of a Levy process, which plays a fundamental role in our results. In Section 3, we introduce 
the projection estimators for the Levy densities and show pointwise central limit theorems 
for them. The uniform case and the resulting confidence bands are developed in Section 
4. Section 5 illustrates the performance of the projection estimators and confidence bands 
using a simulation experiment in the case of a variance gamma Levy model. Finally, two 
appendices collect the technical details of our results. 



2. A useful small-time asymptotic result



The critical time span $\delta_T$ required for the validity of (1.8) was characterized in [11] by
the property that

$$\sup_{y\in D}\left|\frac{1}{\Delta}\mathbb{P}[X_\Delta\ge y]-\nu([y,\infty))\right|<k\,\frac{1}{T}$$ (2.1)

for all $0<\Delta<\delta_T$, where $k$ is a constant (independent of $T$ and $\Delta$). For practical reasons,
an "explicit" estimate of this critical mesh is necessary. The following proposition shows
that $\delta_T = T^{-1}$ suffices and serves as the fundamental property of Levy processes used for
the asymptotic theory developed in this paper. The proof of the proposition is provided in
Appendix A; also, see [15] for related higher order polynomial expansions for $\mathbb{P}(X_t\ge y)$.

Proposition 2.1. Suppose that the Levy density $s$ of $X$ is Lipschitz in an open set
$D_0$ containing $D = [a,b]\subset\mathbb{R}\setminus\{0\}$ and that $s(x)$ is uniformly bounded on
$|x|>\delta$ for any $\delta>0$. Then, there exist a $k>0$ and a $t_0>0$ such that, for all $0<t<t_0$,

$$\sup_{y\in D}\left|\frac{1}{t}\mathbb{P}[X_t\ge y]-\nu([y,\infty))\right|<kt.$$ (2.2)

3. Pointwise central limit theorem 

Throughout this paper, we assume that the Levy process $\{X_t\}_{t\ge 0}$ is being sampled over
a time horizon $[0,T]$ at discrete times $0 = t_0^T<\cdots<t_n^T = T$. We also use the notation
$\pi_T := \{t_k^T\}_{k=0}^{n}$ and $\bar\pi_T := \max_k\{t_k^T-t_{k-1}^T\}$, where we will sometimes
drop the subscript $T$. The following statistics are the main building blocks for our estimation:

$$\hat\beta^{\pi_T}(\varphi) := \frac{1}{T}\sum_{k=1}^{n}\varphi(X_{t_k}-X_{t_{k-1}}).$$ (3.1)

In the case of a quadratic function $\varphi(x) = x^2$, $\sum_{k=1}^{n}\varphi(X_{t_k}-X_{t_{k-1}})$ is the
so-called realized quadratic variation of the process. Thus, the statistics (3.1) can be interpreted
as the realized $\varphi$-variation of the process per unit time based on the observations
$X_{t_0},\ldots,X_{t_n}$. The estimators (3.1) were proposed independently by Woerner [25] and
Figueroa-Lopez [10].

The main virtue of the statistics (3.1) lies in their ability to recover
$\beta(\varphi) := \int\varphi(x)s(x)\,dx$ as $T\to\infty$ and $\bar\pi_T\to 0$ for bounded
$\nu$-continuous functions $\varphi$ such that $\varphi(x)\to 0$ fast enough as $x\to 0$. This result was
obtained in [25] (Theorem 5.1 therein) for regular sampling schemes and in [12] (Proposition 2.2
therein) for general sampling schemes and a more general class of functions $\varphi$ (see also [11],
Theorem 2.3, for related central limit theorems). The consistency of $\hat\beta^{\pi_T}(\varphi)$ for
$\beta(\varphi)$ leads us to propose

$$\hat s(x) := \sum_{j=1}^{d}\hat\beta^{\pi_T}(\varphi_j)\varphi_j(x)$$ (3.2)

as a natural estimator for the orthogonal projection $s^\perp$ defined in (1.3). The nonparametric
estimator (3.2) was proposed in [10], where the problem of model selection was
also considered under continuous-time sampling.
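As an illustration of how the statistic (3.1) and the projection estimator (3.2) fit together, here is a minimal sketch for the piecewise constant sieve (degree 0), where the orthonormal basis function of each cell is the normalized indicator of that cell. The function names and the array-of-increments input are assumptions made for the example and are not part of the paper.

```python
import numpy as np


def beta_hat(phi, increments, T):
    """Statistic (3.1): realized phi-variation of the increments per unit time."""
    return np.sum(phi(increments)) / T


def projection_estimator_pc(increments, T, a, b, m):
    """Projection estimator (3.2) on piecewise constants over a regular partition of [a, b].

    Returns the partition edges and the value of the estimator on each of the m cells.
    """
    edges = np.linspace(a, b, m + 1)
    width = (b - a) / m
    values = np.empty(m)
    for i in range(m):
        lo, hi = edges[i], edges[i + 1]
        # Orthonormal basis function of cell i: width^{-1/2} on (lo, hi], zero elsewhere.
        phi_i = lambda x, lo=lo, hi=hi: ((x > lo) & (x <= hi)) / np.sqrt(width)
        # On the cell, the estimator equals beta_hat(phi_i) * phi_i(x).
        values[i] = beta_hat(phi_i, increments, T) / np.sqrt(width)
    return edges, values
```

In this special case the estimator on a cell reduces to the number of increments falling in the cell divided by $T$ times the cell width, which is a histogram-type estimate of the jump intensity.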

As was discussed in the Introduction, one can construct a projection estimator on
the regular piecewise polynomials $\mathcal{S} = \mathcal{S}_{k,m}$ of Definition 1.1 that converges to $s$,
under the integrated mean square distance, at a rate at least as good as $T^{-2\alpha/(2\alpha+1)}$. Such a
rate can be ensured by "tuning" the number of classes $m$ in the sieve, as well as the
sampling frequency $\pi$, to both the degree of smoothness $\alpha$ of $s$ and the time horizon $T$.
It is natural to wonder whether it is possible to construct a projection estimator $\hat s_T$ such
that

$$T^{\alpha/(2\alpha+1)}(\hat s_T(x)-s(x))\xrightarrow{\mathcal{D}}\sigma Z$$



Confidence intervals and bands for Levy densities 






as $T\to\infty$, for $Z\sim\mathcal{N}(0,1)$ and a constant $\sigma$. We are unable to obtain this result due
to the fact that the bias $\mathbb{E}\hat s_T(x)-s(x)$ of any projection estimator $\hat s_T$ is, at best,
$O(T^{-\alpha/(2\alpha+1)})$. However, in this section, we show that for any
$0<\beta<\frac{\alpha}{2\alpha+1}$, there exists a projection estimator $\hat s_T$ such that

$$c'_T(\hat s_T(x)-s(x))\xrightarrow{\mathcal{D}}\sigma Z$$

for a normalizing constant $c'_T\asymp T^{\beta}$ (i.e., $kT^{\beta}\le c'_T\le k'T^{\beta}$ for some constants
$k,k'\in(0,\infty)$ independent of $T$). As is often the case, our approach consists of first obtaining a
central limit theorem for $\hat s(x)$ centered at $\mathbb{E}\hat s(x)$ with normalizing constants
$c'_T\asymp T^{\beta}$ and, subsequently, making the bias $\mathbb{E}\hat s(x)-s(x)$ to be $o(1/c'_T)$. The
central limit theorem for $\hat s(x)$ follows from a classical central limit theorem for row-wise
independent arrays.

Below, Legendre polynomials $\{Q_j\}_{j\ge 0}$ on $L^2([-1,1],dx)$ are used to devise an
orthonormal basis for the sieve $\mathcal{S}_{k,m}$ of Definition 1.1. Also, we consider Levy densities $s$
whose restrictions to $D := [a,b]$ belong to the Besov class $\mathcal{B}^{\alpha}_{\infty}(L^{\infty}([a,b]))$
(i.e., functions satisfying (1.7) with $r\in\mathbb{N}$ and $\kappa\in(0,1]$ such that $\alpha = r+\kappa$). The
following is the main theorem of this section. Its proof is deferred to Appendix B.

Theorem 3.1. Suppose that the Levy density $s$ of $X$ satisfies the conditions of
Proposition 2.1 and belongs to $\mathcal{B}^{\alpha}_{\infty}(L^{\infty}([a,b]))$ for some $\alpha\ge 1$. Let $c_T$ be
a normalizing constant and let $\hat s_T$ be the projection estimator on $\mathcal{S}_{k,m_T}$ based on
sampling times $\pi_T$ such that the following conditions are satisfied:

(i) $c_T\to\infty$; (ii) $\frac{c_T^2\,m_T}{T}\to 1$; (iii) $c_T\,m_T\,\bar\pi_T\to 0$;
(iv) $c_T\,m_T^{-\alpha}\to 0$; (v) $k\ge\alpha-1$.

Then, for any fixed $x\in(a,b)$ for which $s(x)>0$,

$$\frac{c_T}{b_{k,m_T}(x)}(\hat s_T(x)-s(x))\xrightarrow{\mathcal{D}}\sigma(x)Z,$$ (3.3)

where

$$Z\sim\mathcal{N}(0,1),\qquad \sigma^2(x) := (b-a)^{-1}s(x),$$

$$b_{k,m}^2(x) := \sum_{j=0}^{k}(2j+1)\sum_{i=1}^{m}Q_j^2\!\left(\frac{2x-(x_i+x_{i-1})}{x_i-x_{i-1}}\right)\mathbf{1}_{(x_{i-1},x_i]}(x).$$

Also, for any fixed $0<\beta<\frac{\alpha}{2\alpha+1}$, the resulting projection estimator $\hat s_T$ with
$m_T = [T^{1-2\beta}]$ is such that

$$\frac{T^{\beta}}{b_{k,m_T}(x)}(\hat s_T(x)-s(x))\xrightarrow{\mathcal{D}}\sigma(x)Z,$$

provided that $\bar\pi_T = T^{-\gamma}$ with $\gamma>1-\beta$.

Remark 3.2.

(1) In view of (1.5), $1\le b_{k,m}^2(x)\le\sum_{j=0}^{k}(2j+1)$ and, hence, the normalizing constant
$c'_T := c_T/b_{k,m_T}(x)\asymp c_T$. Also, note that $b_{k,m}\equiv 1$ in the piecewise constant case
($k = 0$).

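(2) Theorem 3.1 will allow us to construct approximate confidence intervals for $s(x)$.
Concretely, the $100(1-\alpha)\%$ interval for $s(x)$ is approximately given by

$$\hat s_T(x)\pm z_{\alpha/2}\,\frac{b_{k,m_T}(x)}{c_T}\,\frac{\hat s_T^{1/2}(x)}{(b-a)^{1/2}},$$

where $z_{\alpha/2}$ is the $\alpha/2$ normal quantile.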
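The construction of a pointwise interval from (3.3) can be sketched in a few lines for the piecewise constant case, where $b_{0,m}\equiv 1$. This is only an illustration under two assumptions that are not stated in the text: the normalization is taken as $c_T = (T/m)^{1/2}$, which satisfies condition (ii) exactly, and $\sigma^2(x)$ is replaced by the plug-in value $\hat s_T(x)/(b-a)$. The function name `pointwise_ci` is hypothetical.

```python
import math
from statistics import NormalDist


def pointwise_ci(s_hat_x, T, a, b, m, level=0.95):
    """Approximate confidence interval for s(x) on piecewise constants (k = 0)."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)   # upper normal quantile
    c_T = math.sqrt(T / m)                        # assumed: c_T^2 * m / T = 1
    sigma_hat = math.sqrt(s_hat_x / (b - a))      # assumed plug-in for sigma(x)
    half_width = z * sigma_hat / c_T
    return s_hat_x - half_width, s_hat_x + half_width
```

With these choices the half-width equals the normal quantile times $(\hat s_T(x)/(T w))^{1/2}$, where $w = (b-a)/m$ is the cell width, so the interval shrinks as either the time horizon or the cell width grows.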

4. Confidence bands for Levy densities 

In this section, we address the problem of constructing confidence bands for the Levy
density $s$ of a Levy process using projection estimators on $\mathcal{S}_{k,m}$ based on $n$
evenly spaced observations of the process at $t_0 = 0<\cdots<t_n = T$ on $[0,T]$. Confidence bands
entail the limit in distribution of the uniform norm

$$\|\hat s_n-s\|_{[a,b]} := \sup_{x\in[a,b]}|\hat s_n(x)-s(x)|,$$

but, as before, we will first work with the uniform norm of

$$Y_n(x) := \hat s_n(x)-\mathbb{E}\hat s_n(x),\qquad x\in[a,b],$$ (4.1)

and then estimate the uniform norm of the bias $\mathbb{E}\hat s_n(x)-s(x)$. We follow ideas from
the seminal paper of Bickel and Rosenblatt [3], wherein confidence bands for probability
densities are constructed based on kernel estimators. There are two fundamental general
directions in Bickel and Rosenblatt's approach:

(1) the statistics of interest are expressed in terms of the so-called uniform standardized
empirical process

$$Z_n^0(x) := n^{1/2}\{F_n^*(x)-x\},\qquad x\in[0,1],$$ (4.2)

where, denoting by $F_t$ the distribution of $X_t$ and by $\delta_n := t_i-t_{i-1}$ the time span
between observations, $F_n^*(\cdot)$ is the empirical distribution of
$\{F_{\delta_n}(X_{t_i}-X_{t_{i-1}})\}_{i\le n}$;

(2) the empirical process is approximated by a Brownian bridge and the error
is estimated using Brillinger's result [5] or the Komlos, Major and Tusnady
construction [19].

Once the statistic of interest is related to the Brownian bridge, we will carry out
several successive approximations (see Appendix C for the details), which will allow the
distribution of $\|Y_n\|_{[a,b]}$ to be connected with the limiting distribution of the extreme
value

$$\max_{1\le i\le m}\{\zeta_i^{(k)}\}$$

of independent copies $\{\zeta_i^{(k)}\}_i$ of the random variable

$$\zeta^{(k)} := \sup_{x\in[-1,1]}\left|\sum_{j=0}^{k}\sqrt{2j+1}\,Q_j(x)Z_j\right|,$$ (4.3)

where $Z_j$ are i.i.d. standard normal random variables. The problem is then reduced to
finding the extreme value distribution of a random sample from (4.3). For instance, in
the case $k = 0$, $\zeta^{(0)} = |Z_0|$, which is known to satisfy

$$\lim_{m\to\infty}\mathbb{P}\left(\max_{1\le i\le m}|\zeta_i^{(0)}|\le\frac{y}{a_m}+b_m\right) = e^{-2e^{-y}}$$ (4.4)

for any $y>0$, where

$$a_m = (2\log m)^{1/2},$$ (4.5)

$$b_m = (2\log m)^{1/2}-\tfrac{1}{2}(2\log m)^{-1/2}(\log\log m+\log 4\pi).$$ (4.6)

We are also able to tackle the case $k = 1$, where $\zeta^{(1)} = |Z_0|+\sqrt{3}|Z_1|$, but the general
case is still under investigation. Our assumptions are as follows.

Assumption 1. 

(1) $s$ is positive and continuous on $[a,b]$.

(2) $s$ is differentiable in $(a,b)$ and, moreover, the derivative of $s^{1/2}$ is bounded in
absolute value on $(a,b)$.

We are ready to present the main result of this section. We defer its proof to Ap- 
pendix C. 

Theorem 4.1. Suppose that $\nu(\mathbb{R}) = \infty$ or $\sigma\neq 0$. Also, suppose that the Levy density
$s$ satisfies the conditions of Proposition 2.1 and Assumption 1. Let $T_n\to\infty$ and
$m_n\to\infty$ be such that

/•\ r?) 1 cTj 1 n— >-oo ^ /.., log n n— >oo ^ 

(i) S logd -mnlogmn — > 0, (u) — m„logm„ — > 0, 

where $\delta_n := T_n/n$. Then, for $k\in\{0,1\}$, the deviation process of (4.1) satisfies

$$\lim_{n\to\infty}\mathbb{P}\left(a_{m_n}\left\{\kappa\,\tau_n^{1/2}\sup_{x\in[a,b]}|s^{-1/2}(x)Y_n(x)|-b_{m_n}\right\}\le y\right) = e^{-\kappa' e^{-y}},$$ (4.7)

where $\tau_n := T_n/m_n$, $a_m$ and $b_m$ are defined as in (4.5)-(4.6) and
$(\kappa,\kappa') = ((b-a)^{1/2},2)$ if $k = 0$ or $(\kappa,\kappa') = ((b-a)^{1/2}2^{-1},4)$ if $k = 1$.

The previous result shows that

$$a_{m_n}\left\{\kappa\,\tau_n^{1/2}\sup_{x\in[a,b]}s^{-1/2}(x)|\hat s_n(x)-\mathbb{E}\hat s_n(x)|-b_{m_n}\right\}$$

converges to a Gumbel distribution. The final step in constructing our confidence bands
consists of finding conditions for replacing $\mathbb{E}\hat s_n$ with $s$. The following result shows this
step. Its proof is presented in Appendix C.

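Corollary 4.2. Suppose that the conditions of Theorem 4.1 hold true, that the restriction
of $s$ to $[a,b]$ is a member of $\mathcal{B}^{\alpha}_{\infty}(L^{\infty}([a,b]))$ and also that

(iii) $\tau_n\,m_n^{-2\alpha}\log m_n\to 0$ as $n\to\infty$. (4.8)

Then,

$$\lim_{n\to\infty}\mathbb{P}\left(a_{m_n}\left\{\kappa\,\tau_n^{1/2}\sup_{x\in[a,b]}\frac{|\hat s_n(x)-s(x)|}{s^{1/2}(x)}-b_{m_n}\right\}\le y\right) = e^{-\kappa' e^{-y}},$$ (4.9)

where we have used the same notation for $\kappa$ and $\kappa'$ as in Theorem 4.1.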

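The previous corollary allows us to construct confidence bands for $s$ on $[a,b]$ based on
the projection estimators $\hat s_n$ on regular piecewise linear (or constant) polynomials. Indeed,
suppose that $y^*_\alpha$ is such that $\exp\{-\kappa' e^{-y^*_\alpha}\} = 1-\alpha$ and let

$$d_n := \left(\frac{y^*_\alpha}{a_{m_n}}+b_{m_n}\right)\frac{\tau_n^{-1/2}}{\kappa}.$$

Then, as $n\to\infty$,

$$s(x)\in\left(\hat s_n(x)+\frac{d_n^2}{2}\pm\sqrt{\left(\hat s_n(x)+\frac{d_n^2}{2}\right)^2-(\hat s_n(x))^2}\right),$$ (4.10)

with $100(1-\alpha)\%$ confidence. The above interval is asymptotically equivalent to the
following, simpler, interval:

$$s(x)\in\left(\hat s_n(x)\pm\left(\frac{y^*_\alpha}{a_{m_n}}+b_{m_n}\right)\frac{\tau_n^{-1/2}}{\kappa}\,(\hat s_n(x))^{1/2}\right).$$ (4.11)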
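To make the band construction concrete, the sketch below computes the extreme value constants (4.5)-(4.6), the level-dependent quantile solving $\exp\{-\kappa' e^{-y}\} = 1-\alpha$, and the factor multiplying $(\hat s_n(x))^{1/2}$ in the simpler band. It assumes that the constant $\kappa$ enters through division, as implied by solving the event in (4.9) for $|\hat s_n(x)-s(x)|$; the function names are illustrative only.

```python
import math


def extreme_value_constants(m):
    """Normalizing constants a_m and b_m of (4.5)-(4.6)."""
    root = math.sqrt(2.0 * math.log(m))
    a_m = root
    b_m = root - 0.5 * (math.log(math.log(m)) + math.log(4.0 * math.pi)) / root
    return a_m, b_m


def band_halfwidth_factor(m, T, a, b, level=0.95, k=0):
    """Return (d, y_star) such that the band reads s_hat(x) +/- d * sqrt(s_hat(x))."""
    if k == 0:
        kappa, kappa_prime = math.sqrt(b - a), 2.0
    elif k == 1:
        kappa, kappa_prime = math.sqrt(b - a) / 2.0, 4.0
    else:
        raise ValueError("Theorem 4.1 covers only k = 0 and k = 1")
    a_m, b_m = extreme_value_constants(m)
    # y_star solves exp(-kappa' * exp(-y)) = level.
    y_star = -math.log(-math.log(level) / kappa_prime)
    tau = T / m  # tau_n = T_n / m_n
    d = (y_star / a_m + b_m) / (kappa * math.sqrt(tau))
    return d, y_star
```

For example, with $m = 40$ classes the constants are roughly $a_m\approx 2.716$ and $b_m\approx 2.010$, and the band at a point $x$ is then `s_hat_x - d * math.sqrt(s_hat_x)` to `s_hat_x + d * math.sqrt(s_hat_x)`.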

We conclude this section with some final remarks. 

Remark 4.3. In the case where $T_n := c_n\cdot n^{\alpha_1}$ and $m_n = [d_n\cdot n^{\alpha_2}]$, for some
$\alpha_1,\alpha_2>0$, $c_n\asymp 1$ and $d_n\asymp 1$, the conditions (i)-(ii) of Theorem 4.1 are satisfied
if $0<\alpha_1<1$ and $0<\alpha_2<(1-\alpha_1)\wedge\alpha_1$. Also, it can be checked that condition (iii)
of Corollary 4.2 is met if

$$0<\alpha_1<\frac{2\alpha+1}{6\alpha+2}\quad\text{and}\quad\frac{\alpha_1}{1+2\alpha}<\alpha_2<(2-3\alpha_1)\wedge\alpha_1.$$ (4.12)

Note that $(\alpha_2-\alpha_1)/2$ can be made arbitrarily close to $-\alpha/(3\alpha+1)$ on the range of
values (4.12) and, thus, $a_{m_n}^{-1}\tau_n^{-1/2}$ can be made to vanish at a rate arbitrarily close to
$(\log n)^{-1/2}n^{-\alpha/(3\alpha+1)}$, provided that $\alpha$ is large enough. In particular, if
$0<\varepsilon<1$ and $s$ is smooth enough, then $m_n$ and $T_n$ can be chosen such that

$$\|\hat s_n-s\|_{[a,b]} = O(\log^{-1/2}(n)\,n^{-1/3+\varepsilon}).$$



5. A numerical example 

Variance gamma processes (VG) were proposed in [20] and [7] as substitutes for Brownian
motion in the Black-Scholes model. Since their introduction, VG processes have received
a great deal of attention, even in the financial industry. A variance gamma process
$X = \{X(t)\}_{t\ge 0}$ is a time-changed Brownian motion with drift of the form

$$X(t) = \theta U(t)+\sigma W(U(t)),$$ (5.1)

where $\{W(t)\}_{t\ge 0}$ is a standard Brownian motion, $\theta\in\mathbb{R}$, $\sigma>0$ and
$U = \{U(t)\}_{t\ge 0}$ is an independent gamma Levy process such that $\mathbb{E}[U(t)] = t$ and
$\mathrm{Var}[U(t)] = \nu t$. Since gamma processes are subordinators, the process $X$ is itself a Levy
process (see [23], Theorem 30.1) and its Levy density takes the form

$$s(x) = \begin{cases}\dfrac{\alpha}{|x|}\exp\left(-\dfrac{|x|}{\beta^-}\right), & \text{if } x<0,\\[2mm] \dfrac{\alpha}{x}\exp\left(-\dfrac{x}{\beta^+}\right), & \text{if } x>0,\end{cases}$$ (5.2)

where $\alpha>0$, $\beta^-\ge 0$ and $\beta^+\ge 0$ with $|\beta^-|+\beta^+>0$ (see, e.g., [9] for
expressions for $\beta^{\pm},\alpha$ in terms of $\theta$, $\sigma$ and $\nu$). In that case, $\alpha$ controls
the overall jump activity, while $\beta^+$ and $\beta^-$ take charge of the intensity of large positive
and negative jumps, respectively. In particular, the difference between $1/\beta^+$ and $1/\beta^-$
determines the frequency of drops relative to rises, while their sum measures the frequency of large
moves relative to small ones.
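Because (5.1) writes the process as a Brownian motion with drift run on a gamma clock, increments over a fixed time step can be simulated by first drawing the gamma time increments and then drawing conditionally normal increments. The sketch below uses this standard time-change technique; it is not claimed to be the simulation scheme used for the experiments reported here, and the function name is hypothetical.

```python
import numpy as np


def simulate_vg_increments(n, delta, theta, sigma, nu, rng):
    """Draw n i.i.d. increments of a variance gamma process over a time step delta."""
    # Gamma clock increments with mean delta and variance nu * delta.
    g = rng.gamma(shape=delta / nu, scale=nu, size=n)
    # Given the clock increment g, the increment is normal with mean theta*g and variance sigma^2*g.
    z = rng.standard_normal(n)
    return theta * g + sigma * np.sqrt(g) * z
```

A call such as `simulate_vg_increments(1000, 0.01, -0.1, 0.2, 0.05, np.random.default_rng(0))` returns an array of increments whose mean is close to `theta * delta` and whose variance is close to `sigma**2 * delta + theta**2 * nu * delta`.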

The performance of projection estimation for the variance gamma Levy process was
illustrated in [11] via simulation experiments. In this section, we want to further extend
this analysis to show the performance of confidence bands. As in [11], we take as sieve
the class $\mathcal{S}_{0,m}$, namely, the span of the indicator functions
$\chi_{(x_0,x_1]},\ldots,\chi_{(x_{m-1},x_m]}$, where $x_0<\cdots<x_m$ is a regular partition of an
interval $D = [a,b]$, with $0<a$ or $b<0$. We take parameter values which are partially motivated by
the empirical findings of [7] based on daily returns of the S&P500 index from January 1992 to
September 1994 (see their Table I). Using maximum likelihood methods, the annualized estimates of the
parameters for the variance gamma model were reported to be $\hat\theta_{ML} = -0.00056256$,
$\hat\sigma^2_{ML} = 0.01373584$ and $\hat\nu_{ML} = 0.002$, from which it can easily be found that

$$\alpha = 500,\qquad \beta^+ = 0.0037056\qquad\text{and}\qquad\beta^- = 0.0037067.$$ (5.3)

These parameter values seem to be consistent with other empirical studies (see, e.g., 
[24]), although we admit that parameter values fitted to intraday high-frequency data 
would have been preferable. 
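The step from the maximum likelihood estimates to the values in (5.3) can be reproduced numerically. The sketch below assumes the usual variance gamma parametrization, $\alpha = 1/\nu$ and $\beta^{\pm} = \sqrt{\theta^2\nu^2/4+\sigma^2\nu/2}\pm\theta\nu/2$, and treats the reported value 0.01373584 as the variance parameter $\sigma^2$; both are assumptions, since the mapping is only cited (to [9]) and not written out in the text. The function names are illustrative.

```python
import math


def vg_levy_parameters(theta, sigma2, nu):
    """Map (theta, sigma^2, nu) to (alpha, beta_plus, beta_minus) of the Levy density (5.2)."""
    alpha = 1.0 / nu
    root = math.sqrt(theta ** 2 * nu ** 2 / 4.0 + sigma2 * nu / 2.0)
    beta_plus = root + theta * nu / 2.0
    beta_minus = root - theta * nu / 2.0
    return alpha, beta_plus, beta_minus


def vg_levy_density(x, alpha, beta_plus, beta_minus):
    """Levy density (5.2) of the variance gamma process at a nonzero point x."""
    if x > 0:
        return alpha / x * math.exp(-x / beta_plus)
    if x < 0:
        return alpha / abs(x) * math.exp(-abs(x) / beta_minus)
    raise ValueError("the Levy density is not defined at the origin")
```

Under these assumptions, `vg_levy_parameters(-0.00056256, 0.01373584, 0.002)` agrees with (5.3) to within about $10^{-7}$ in each component.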

We simulate 100 samples of the VG process with a maximal time horizon of $T = 10$
years and a sampling span between observations of $\delta = 1/(252\times 6.5\times 60\times 12)$. Assuming
a business calendar year of 252 days and a trading day of 6.5 hours, the time span between
observations corresponds to 5 seconds. Intraday data of such characteristics is available
via financial databases such as NASDAQ TAQ.

We estimate the sample coverage probabilities

$$c_\alpha := \mathbb{P}(s(\cdot)\in\text{the }100(1-\alpha)\%\text{ confidence band on }[a,b]),$$

based on the 100 simulations for two sampling frequencies $\delta = 1/(252\times 6.5\times 60\times 12)$
(5 seconds) and $\delta = 1/(252\times 6.5\times 60)$ (1 minute), and maturities of $T = 1,3,5$ and 10
years. We use two possible numbers of classes: $m = 40$ and the data-driven selected $\hat m$
proposed in [11]. Concretely, the selection criterion is given by

$$\hat m := \arg\min_{m}\{-\|\hat s^{\pi}_m\|^2+\mathrm{pen}^{\pi}(\mathcal{S}_{k,m})\},$$ (5.4)

where $\hat s^{\pi}_m$ is given according to (3.2) and $\mathrm{pen}^{\pi}$ is given by

$$\mathrm{pen}^{\pi}(\mathcal{S}_{k,m}) = \frac{2}{T^2}\sum_{l=1}^{n}\sum_{i,j}\varphi_{i,j}^2(X_{t_l}-X_{t_{l-1}}).$$ (5.5)

The quantity to be minimized in (5.4) is a discrete-time version of an unbiased estimator
of the shifted risk $\mathbb{E}\|s-\hat s\|^2-\|s\|^2$ (see [11], Section 5, for more details).
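For the piecewise constant sieve, the penalized criterion takes a particularly simple form in terms of the number of increments falling in each cell. The sketch below is one possible implementation for degree 0, assuming that the penalty carries the factor $2/T^2$; the function names and the list of candidate values of $m$ are illustrative assumptions.

```python
import numpy as np


def criterion(increments, T, a, b, m):
    """Penalized criterion -||s_hat||^2 + pen for piecewise constants with m classes."""
    # np.histogram uses half-open bins [lo, hi) except for the last one; for continuous
    # data this agrees almost surely with the cells (lo, hi] of the sieve.
    counts, _ = np.histogram(increments, bins=np.linspace(a, b, m + 1))
    width = (b - a) / m
    # ||s_hat||^2 is the sum of squared coefficients beta_hat_i = counts_i / (T * sqrt(width)).
    norm_sq = np.sum(counts.astype(float) ** 2) / (T ** 2 * width)
    # The squared basis functions equal 1/width on their cell, so the penalty counts increments.
    penalty = 2.0 * np.sum(counts) / (T ** 2 * width)
    return -norm_sq + penalty


def select_m(increments, T, a, b, candidates):
    """Return the candidate number of classes minimizing the penalized criterion."""
    values = [criterion(increments, T, a, b, m) for m in candidates]
    return candidates[int(np.argmin(values))]
```

A call such as `select_m(increments, T, 0.001, 0.1, list(range(5, 55, 5)))` would then return the selected number of classes among the listed candidates.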

Table 1 shows the coverage probabilities for the interval $[a,b] = [0.001,0.1]$ (based
on 100 simulations). Overall, the coverage probabilities of the confidence bands for $m = 40$
are good. In the case of the data-driven selected $\hat m$, there are some values of $m$ for which
probabilities are quite low. Such cases occur (only) when the band does not contain
the density very near $a = 0.001$. It seems more reasonable to take an average between
different classes with values of $m$ which are reasonably close in terms of the quantity in
(5.5).

To illustrate how close the estimated Levy density is to the true Levy density and the
overall width of the confidence bands, Figure 1 shows the actual Levy density (solid blue
line), the mean of the penalized projection estimator (solid red line) and the means of the
lower and upper 95%-confidence bands (dashed lines). All the means are computed using
100 confidence bands based on $\delta = 5$ seconds and time horizons of $T = 3$ and $T = 10$ years.
The analogous figures with a sampling time span of $\delta = 1$ minute are shown in Figure 2. In
our empirical results (not shown here for the sake of space), we found that high-frequency
data is crucial to estimate the Levy density near the origin. For instance, the confidence
bands near the origin do not perform well when taking 30-minute observations in a time
period of 10 years. Table 2 gives the estimated coverage probabilities on the interval
[0.005,0.2] based on 30-minute returns.

Table 1. Empirical coverage probabilities of 95% confidence bands on the interval [0.001,0.1]
based on a piece-wise projection estimator with m classes

δ \ T     1 year           3 years          5 years          10 years
5 s       0.97 (m = 40)    0.99 (m = 40)    0.97 (m = 40)    0.97 (m = 40)
          0.98 (m = 35)    0.95 (m = 25)    0.80 (m = 25)
1 min     0.93 (m = 40)    0.94 (m = 40)    0.98 (m = 40)    0.87 (m = 40)
          0.97 (m = 35)    0.75 (m = 25)    0.60 (m = 25)    0.94 (m = 50)

Let us finish with two remarks. First, from an algorithmic point of view, the estimation
for the variance gamma model using penalized projection is not different from the
estimation of the gamma Levy process. We can simply estimate both tails of the variance
gamma process separately. However, from the point of view of maximum likelihood
estimation (MLE), the problem is numerically challenging. Even though the marginal
density functions have "closed" form expressions (see [7]), there are well-documented
issues with MLE (see, e.g., [21]). Finally, it is worth pointing out that applying an efficient
estimation method to a misspecified model could lead to quite undesirable results, as was
illustrated in [11], where MLE was applied to a CGMY model (see [6]) with parameter
values quite close to those of a gamma process. The numerical experiments in [11] show
that a modestly efficient robust nonparametric method is sometimes preferable to a very
efficient estimation method.

[Figure 1 plot. Legend: True Levy density s(x); Proj. Est. on m=40 classes; 95% Upper Bound; 95% Lower Bound.]

Figure 1. Means of projection estimators and corresponding confidence bands for the VG
model based on 100 simulations with a sampling time span of 1/(252 x 6.5 x 60 x 12) (about 5
seconds) during 3 years (left panel) and 10 years (right panel).

Table 2. Empirical coverage probabilities of 95% confidence bands on the interval [0.005, 0.2]
based on a piece-wise projection estimator with m classes

δ \ T     1 year           3 years          5 years          10 years
30 min    0.34 (m = 40)    0.73 (m = 40)    0.87 (m = 40)    0.97 (m = 40)
          0.43 (m = 10)    0.71 (m = 35)    0.85 (m = 35)    0.97 (m = 25)

[Figure 2 plot. Panel title: Means of projection estimators and 95% CB for VG Levy Density.]

Figure 2. Means of projection estimators and corresponding confidence bands for the VG
model based on 100 simulations with a sampling time span of 1/(252 x 6.5 x 60) (about 1
minute) during 3 years (left panel) and 10 years (right panel).




Appendix A: Proof of Proposition 2.1 

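Without loss of generality, we assume that $a>0$. Consider the process

$$\tilde X^{\varepsilon}_t := \int_0^t\int x\,\mathbf{1}_{\{|x|>\varepsilon\}}\,\mu(dx,ds)$$ (A.1)

for $0<\varepsilon<1$, which is well known to be a compound Poisson process with intensity of
jumps $\lambda_\varepsilon := \nu(\{|x|>\varepsilon\})$ and jump distribution
$\frac{1}{\lambda_\varepsilon}\mathbf{1}_{\{|x|>\varepsilon\}}\nu(dx)$. The remainder process,
$X^{\varepsilon} := X-\tilde X^{\varepsilon}$, is then a Levy process with jumps bounded by $\varepsilon$.
Concretely, $X^{\varepsilon}$ has Levy triplet
$(\sigma^2,b_\varepsilon,\mathbf{1}_{\{|x|\le\varepsilon\}}\nu(dx))$, where
$b_\varepsilon := b-\int_{\varepsilon<|x|\le 1}x\,\nu(dx)$. The following tail estimate
will play an important role in the sequel:

$$\mathbb{P}(|X^{\varepsilon}_t|\ge z)\le\exp\{\alpha z_0\log z_0\}\exp\{\alpha z-\alpha z\log z\}\,t^{\alpha z},$$ (A.2)

valid for an arbitrary, but fixed, positive real $\alpha\in(0,\varepsilon^{-1})$ and for any $t,z>0$ such
that $t<z_0^{-1}z$, where $z_0$ depends only on $\alpha$ (see [22], Lemma 3.2, or [23], Section 26, for a
proof).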
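Define

$$A_y(t) := \frac{1}{t}\left\{\frac{1}{t}\mathbb{P}[X_t\ge y]-\nu([y,\infty))\right\},$$

which, for $\varepsilon<\frac{a}{2}\wedge 1$ and after conditioning on the number of jumps, can be written as

$$A_y(t) = \frac{1}{t^2}\mathbb{E}f_y(X^{\varepsilon}_t)e^{-\lambda_\varepsilon t}
+e^{-\lambda_\varepsilon t}\int_{|x|>\varepsilon}\frac{1}{t}\{\mathbb{E}f_y(X^{\varepsilon}_t+x)-f_y(x)\}\,\nu(dx)$$

$$\qquad-\frac{1-e^{-\lambda_\varepsilon t}}{t}\int_{|x|>\varepsilon}f_y(x)\,\nu(dx)
+e^{-\lambda_\varepsilon t}\sum_{n=2}^{\infty}\frac{\lambda_\varepsilon^{n}t^{n-2}}{n!}\,\mathbb{E}f_y\!\left(X^{\varepsilon}_t+\sum_{i=1}^{n}\xi_i\right),$$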

where fy{x) = 'i-x>y The first term on the right-hand side of the above expression is 
bounded uniformly for y E [a, b] and t <tQ, for certain to (a) > 0, because of (A.2) taking 
z = a and a G (2a~^,£~^). The last two terms in the same expression are uniformly 
bounded in absolute value by i^{x > a) and J^(|a;| > e)^, respectively. We need to show that 
the second term is uniformly bounded. Define By{t) := J^^^^^{Efy{Xf + x) — fy{x)}i'{dx). 
Clearly, 

ry ry+£ 
By{t):^ i P{XI>y~x}s{x)dx- ¥{X^ < y - x}s{x) dx 

Jy~e Jy 

+ I ¥{Xl >y- x}s{x) dx- I V{XI <y~ x}s{x) dx. 

J {x<y-E,\x\>E} Jy+s 

Since $s$ is bounded and integrable away from the origin, the last two terms in the 
expression for $B_y(t)$ can be bounded in absolute value by 
$\nu\{|x| > \varepsilon\}\,\mathbb{P}\{|X^\varepsilon_t| \ge \varepsilon\}$. Dividing 
by $t$, this converges to $0$ in light of the well-known limit 

$$\lim_{t\to 0}\frac{1}{t}\,\mathbb{P}(Z_t \ge z) = \nu([z,\infty)), \qquad \text{(A.3)}$$

valid for any Lévy process $Z$ with Lévy measure $\nu$ and any point $z$ of continuity of $\nu$ 
(see, e.g., Bertoin [2], Chapter 1). The other two terms can be bounded as follows: 



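$$\Big|\int_{y-\varepsilon}^{y}\mathbb{P}\{X^\varepsilon_t \ge y-x\}s(x)\,dx - \int_{y}^{y+\varepsilon}\mathbb{P}\{X^\varepsilon_t < y-x\}s(x)\,dx\Big|$$
$$\qquad \le K_1\int_0^{\varepsilon}\mathbb{P}\{|X^\varepsilon_t| \ge u\}\,u\,du + K_0\Big|\int_0^{\varepsilon}\mathbb{P}\{X^\varepsilon_t \ge u\}\,du - \int_0^{\varepsilon}\mathbb{P}\{X^\varepsilon_t < -u\}\,du\Big|,$$

where $K_1$ is the Lipschitz constant of $s$ in $D_0$ and $K_0 := \sup_{y\in D_0}s(y)$. Next, applying 
Fubini's theorem, we can write the expression in the last line above as follows: 

$$\frac{K_1}{2}\,\mathbb{E}\{(|X^\varepsilon_t| \wedge \varepsilon)^2\} + K_0\,|\mathbb{E}h(X^\varepsilon_t)|,$$

where $h(x) = x1_{\{|x|\le\varepsilon\}} - \varepsilon 1_{\{x<-\varepsilon\}} + \varepsilon 1_{\{x>\varepsilon\}}$. 
Using the formulas for the variance and mean of a Lévy process, we obtain that 

$$\sup_{0<t\le 1}\frac{1}{t}\,\mathbb{E}\{(|X^\varepsilon_t| \wedge \varepsilon)^2\} \le \sigma^2 + \int_{|x|\le\varepsilon}x^2\,\nu(dx) + b_\varepsilon^2 < \infty.$$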



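Also, 

$$\Big|\frac{1}{t}\,\mathbb{E}h(X^\varepsilon_t)\Big| \le \frac{1}{t}\,|\mathbb{E}X^\varepsilon_t| + \frac{1}{t}\,\big|\mathbb{E}X^\varepsilon_t 1_{\{|X^\varepsilon_t|>\varepsilon\}}\big| + \frac{\varepsilon}{t}\,\mathbb{P}\{|X^\varepsilon_t| > \varepsilon\}.$$

The last term above converges to $0$ by (A.2). The second term also vanishes since 

$$\frac{1}{t}\,\big|\mathbb{E}X^\varepsilon_t 1_{\{|X^\varepsilon_t|>\varepsilon\}}\big| \le \Big[\frac{1}{t}\,\mathbb{P}\{|X^\varepsilon_t| > \varepsilon\}\Big]^{1/2}\Big[\frac{1}{t}\,\mathbb{E}(X^\varepsilon_t)^2\Big]^{1/2} \to 0$$

as $t \to 0$. Finally, using the formula for the mean of $X^\varepsilon$, we have 

$$\limsup_{t\to 0}\frac{1}{t}\,|\mathbb{E}h(X^\varepsilon_t)| \le \lim_{t\to 0}\frac{1}{t}\,|\mathbb{E}X^\varepsilon_t| = |b_\varepsilon|.$$

We conclude that there exist $t_0 > 0$ and $K > 0$ such that for $t < t_0$, 
$\sup_{y\in D}|B_y(t)|/t < K$. This completes the proof since all other terms in $A_y(t)$ can be 
easily bounded uniformly in $D$. 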
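The small-time limit (A.3), and the linear-in-$t$ error bound that Proposition 2.1 asserts, can be checked numerically in a toy case. The sketch below is only an illustration under assumptions that are not in the paper: a compound Poisson process with rate `lam` and Exp(1) jumps, for which $\nu([z,\infty)) = \lambda e^{-z}$ and the tail probability has an explicit series. The function names are hypothetical.

```python
import math

def cp_tail(t, z, lam=2.0, n_terms=60):
    """P(X_t >= z), z > 0, for a compound Poisson process with rate lam and Exp(1) jumps.

    Conditions on the number of jumps N_t = n and uses the Erlang tail
    P(Gamma(n, 1) >= z) = exp(-z) * sum_{k<n} z^k / k!.
    """
    total = 0.0
    poisson_weight = math.exp(-lam * t)   # P(N_t = 0)
    erlang_sum = 0.0                      # running sum_{k<n} z^k / k!
    power_term = 1.0                      # z^k / k!, starting at k = 0
    for n in range(1, n_terms + 1):
        poisson_weight *= lam * t / n     # now P(N_t = n)
        erlang_sum += power_term          # adds the k = n - 1 term
        power_term *= z / n               # now z^n / n!
        total += poisson_weight * math.exp(-z) * erlang_sum
    return total

def small_time_error(t, z, lam=2.0):
    """(1/t) P(X_t >= z) - nu([z, inf)) for the toy Levy measure nu(dx) = lam * exp(-x) dx."""
    return cp_tail(t, z, lam) / t - lam * math.exp(-z)

if __name__ == "__main__":
    for t in (1e-1, 1e-2, 1e-3):
        err = small_time_error(t, 0.5)
        print(f"t={t:g}  error={err:+.6f}  error/t={err / t:+.4f}")
```

In this toy model the ratio `error / t` stays bounded as `t` decreases, which is the behaviour the proposition describes for $y$ in an interval away from the origin.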



Appendix B: Proofs of the pointwise central limit 
theorem 

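Throughout this section, we shall use the orthonormal basis 
$\{\varphi_{i,j}\}_{1\le i\le m,\,0\le j\le k}$ of (1.4). We start our proof with the following 
easy lemma. 

Lemma B.1. Suppose that $\varphi$ has support $[c,d] \subset \mathbb{R}_+\setminus\{0\}$, where 
$\varphi$ is continuous with continuous derivative. Then, 

$$\Big|\frac{1}{\Delta}\,\mathbb{E}\varphi(X_\Delta) - \beta(\varphi)\Big| \le \Big(|\varphi(c)| + \int_c^d|\varphi'(u)|\,du\Big)M_\Delta([c,d]),$$

where $\beta(\varphi) := \int\varphi(x)s(x)\,dx$ and 
$M_\Delta([c,d]) := \sup_{y\in[c,d]}\big|\frac{1}{\Delta}\mathbb{P}[X_\Delta \ge y] - \nu([y,\infty))\big|$. 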
Proof. The result is clear from the identities 

$$\mathbb{E}\varphi(X_\Delta) = \varphi(c)\,\mathbb{P}[X_\Delta \ge c] + \int_c^{\infty}\varphi'(u)\,\mathbb{P}[X_\Delta \ge u]\,du,$$
$$\int\varphi(x)\,\nu(dx) = \varphi(c)\,\nu([c,\infty)) + \int_c^{\infty}\varphi'(u)\,\nu([u,\infty))\,du,$$

which are standard consequences of Fubini's theorem. □ 

Our first result shows a central limit theorem for $\hat{s}(x)$ centered at $\mathbb{E}\hat{s}(x)$. 
Let us remark that the fact that the Legendre polynomial $Q_j$ is not constant for $j > 0$ poses some 
difficulty since the relative position of $x$ inside its class changes greatly with $m$. 

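Lemma B.2. Under the notation and assumptions of Theorem 3.1, it follows that 

$$\frac{c_T}{b^{1/2}_{k,m_T}(x)}\,\big(\hat{s}_T(x) - \mathbb{E}\hat{s}_T(x)\big) \xrightarrow{\ \mathcal{D}\ } \bar{\sigma}Z.$$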



Proof. We apply a central limit theorem version for row-wise independent arrays of 
random variables (see, e.g., the corollary following [8], Theorem 7.1.2). Note that 

St ■■= -p^{sT{x) - Est(x)) 

k 

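where $\varphi_{j,T}(\cdot)$ is of the form 

$$\varphi_{j,T}(\cdot) = \sqrt{\frac{2j+1}{b_T-a_T}}\;Q_j\Big(\frac{2\,\cdot - (a_T+b_T)}{b_T-a_T}\Big)$$

with $a_T, b_T$ such that $x \in [a_T, b_T)$ and $b_T - a_T = (b-a)/m_T$. In that case, 
$\sigma_T^2 := \mathrm{Var}(S_T)$ is given by 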

■= rp2U2 E E 'Ph.T{x)'PJ2,T{x)Cov{ipj,^T{X^^^),ipJ^^T{X^^^)), (B.l) 

^ n,n=o 
where we have used $\Delta^i_T := t^T_i - t^T_{i-1}$. Let us analyze the above covariances, scaled by 
$\Delta^i_T$. First, applying Lemma B.1, (1.5) and Proposition 2.1, there exist $t_0 > 0$ and $K > 0$ 
such that whenever $\Delta < t_0$, 



^E(Pj,,t(^a)</'j2,t(^a)- / iPj,,T{y)^j2,T{y)s{y)dy 



< 



KA 



Similarly, using the additional fact that | J (^j_j'(y)s(y) dj/| < ||s||, there exists a to > 
and K > Q such that whenever A < to : 



Thus, using assumption (iii) of Theorem 3.1, we have 
1 



<KA. 



— Cov(v3j-,,t(^a^)>¥'j2,t(^a^)) =ot(1) + j ipj,^Tix)(pj2,T{y)siy)dy, 

where 0^(1) — > uniformly in i as T — > oo. Thus, in view of the fact that > I, (1.5) 
and assumption (ii) of Theorem 3.1, we have — where 



'T - j.fj2 



V3uT{x)ifj.,^T{x) j (pj,,T{y)'Pj2.T{y)s{y) 



Ay. 



Next, the continuity of s at x, assumption (ii) of Theorem 3.1 and the fact that the 
support of tpjj- contains x and shrinks to collectively yield that 



lim -— 



il,i2=0 



Yl 'Pn,T{x)iPj2,T{x) j 'Pji,T{y)'Pj2,Tiy){s{y)- s{x)) 



dy = 0. 



This implies that $\lim_{T\to\infty}\sigma_T^2 = \lim_{T\to\infty}\bar{\sigma}_T^2 = s(x)/(b-a)$, in view 
of condition (ii) and the definition of $b_{k,m_T}$. Finally, we consider the "standardized" sum 
$Z_T := S_T/\sigma_T$. By the corollary following [8], Theorem 7.1.2, $Z_T$ will converge to 
$\mathcal{N}(0,1)$ because 



sup 



Ct 



< 



CtTTIt 



Y l'^i,T(a;)v?j, T(^t^ - ^4-1)1 




J=0 



TaxbrnT {b~ a 

as $T \to \infty$, in view of assumptions (i)-(ii) and the fact that $b_{k,m_T}(x) \ge 1$. This implies 
the proposition since $\sigma_T^2 \to s(x)(b-a)^{-1}$. □ 



The last step is to estimate the rate of convergence of the bias term. 
Lemma B.3. Under the notation and assumptions of Theorem 3.1, 
$\mathbb{E}\hat{s}_T(x) - s(x) = o\big(b^{1/2}_{k,m_T}(x)/c_T\big)$ as $T \to \infty$ for any fixed 
$x \in (a,b)$ such that $s(x) > 0$. 



Proof. We use the same notation as in the proof of Lemma B.2. Obviously, 



CT 



■\EsT{^)-six)\<^Y.^TAT{^T) 



where 



At (A) 



CT 



1 ^ 



J=0 



It then suffices to show that max^ At {A'!j.) — > as T ^ oo. Note that 



At{A) < 



CT 



Y.^i^t{x)< -I 



Vj,t{Xa)- / ipj,T{v)s{y)Ay 



+ 



CT 



X! Vj.T{x)ipj,T{y){s{y) - s{x)) dy 



where we have used the fact that $\int\varphi_{j,T}(y)\,dy = \delta_0(j)$. We shall show that each of 
the two terms on the right-hand side of the above inequality, which we denote $A^1_T(\Delta)$ and 
$A^2_T$, respectively, vanishes as $T \to \infty$. Using (1.5), Lemma B.1 and Proposition 2.1, there 
exist a $K > 0$ and $T_0 > 0$ such that, for $T > T_0$, 



a:^(a^) < K 



CTAi, 



bmribT - At) 



< K 



CTnriTTTT 



as $T \to \infty$, due to (i)-(iii). To deal with the term $A^2_T$, we treat the two cases 
$\alpha = 1$ and $\alpha > 1$ separately. Suppose that $\alpha = 1$. Using the Cauchy-Schwarz 
inequality twice (for summation and for the integral) and the fact that 
$\sum_{j=0}^{k}\varphi^2_{j,T}(x) = b_{k,m_T}(x)/(b_T - a_T)$, we have 



CT 



\/bT — Ot 



1/2 



{s{y)-s{x)fdy\ <KcT{bT-aT) 



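for some constant $K < \infty$. In light of assumption (iv) of Theorem 3.1, $A^2_T \to 0$. Let us 
now assume that $\alpha > 1$. We first note that 

$$\sum_{j=0}^{k}\varphi_{j,T}(x)\int\varphi_{j,T}(y)\,(y-x)^{j'}\,dy = 0$$

for $j' = 1,\ldots,k$. This is because the left-hand side is $p^{\perp}(x)$, where $p^{\perp}(y)$ is 
the orthogonal projection of the function $p(y) := (y-x)^{j'}$ on $\mathcal{S}_{k,m_T}$ and, clearly, 
$p^{\perp}(x) = p(x) = 0$. Also, by Taylor's theorem, 

$$s(y) - s(x) = \sum_{i=1}^{r}\frac{s^{(i)}(x)}{i!}\,(y-x)^i + \int_x^y\big(s^{(r)}(v) - s^{(r)}(x)\big)\frac{(y-v)^{r-1}}{(r-1)!}\,dv,$$

where $r := \lfloor\alpha\rfloor$, the largest integer that is (strictly) smaller than $\alpha$. Since 
$k > \alpha - 1$, we have that $k \ge r$ and 

$$\sum_{j=0}^{k}\varphi_{j,T}(x)\int\varphi_{j,T}(y)\,(s(y)-s(x))\,dy = \sum_{j=0}^{k}\varphi_{j,T}(x)\int\varphi_{j,T}(y)\int_x^y\big(s^{(r)}(v) - s^{(r)}(x)\big)\frac{(y-v)^{r-1}}{(r-1)!}\,dv\,dy.$$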

Again applying the Cauchy-Schwarz inequality twice (for summation and for the inte- 
gral), we have 



K<-^i2\^j,T{x)\ ^(sW(^^)-s«(x))^^-^d^;dy 



< 



3=0 



CT 



k i.b. 



\/hT — ax 



'T ( ry 



E 



{r-iy. 



1/2 



dv > dy 



Finally, by the Hölder condition (1.7), $A^2_T \le K c_T m_T^{-\alpha} \to 0$. □ 
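The key algebraic step in the case $\alpha > 1$ is that the projection kernel $\sum_j \varphi_{j,T}(x)\varphi_{j,T}(y)$ reproduces polynomials of degree at most $k$, so that the monomials $(y-x)^{j'}$ are annihilated at the point $x$. A minimal numerical sketch of this fact follows. It assumes the basis on a single class $[a_T, b_T)$ is the rescaled Legendre family $\sqrt{(2j+1)/(b_T-a_T)}\,Q_j$; the helper names and the use of a composite Simpson rule are illustrative choices, not part of the paper.

```python
import math

def legendre(j, u):
    """Legendre polynomial Q_j(u) on [-1, 1] via the three-term recurrence."""
    if j == 0:
        return 1.0
    p_prev, p = 1.0, u
    for n in range(1, j):
        p_prev, p = p, ((2 * n + 1) * u * p - n * p_prev) / (n + 1)
    return p

def phi(j, y, a, b):
    """Orthonormal rescaled Legendre basis function on [a, b]."""
    u = (2.0 * y - (a + b)) / (b - a)
    return math.sqrt((2 * j + 1) / (b - a)) * legendre(j, u)

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def projection_at_x(p, x, a, b, k):
    """Value at x of the orthogonal projection of p onto polynomials of degree <= k on [a, b]."""
    return sum(
        phi(j, x, a, b) * simpson(lambda y: phi(j, y, a, b) * p(y), a, b)
        for j in range(k + 1)
    )

if __name__ == "__main__":
    a, b, x, k = 1.0, 1.5, 1.2, 3
    for power in range(1, k + 1):
        value = projection_at_x(lambda y, q=power: (y - x) ** q, x, a, b, k)
        print(power, value)  # each value should be numerically close to 0
```

Because $(y-x)^{j'}$ already lies in the polynomial space for $j' \le k$, its projection is itself, and evaluating it at $y = x$ gives zero up to quadrature and rounding error.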



Appendix C: Proofs of the uniform central limit 
theorem 

In this section, we show the results of Section 4. We recall that the estimators are 
based on observation of the process at evenly-spaced times 
$\pi^n : t_0 = 0 < \cdots < t_n = T$. The time span between observations is $\delta^n := T/n$. 

Let us first remark that under the assumption that $\sigma \ne 0$ or $\nu(\mathbb{R}) = \infty$, the 
distribution $F_t(x)$ is continuous for all $t > 0$ (see [23], Theorem 27.4). In particular, 
$\{F_{\delta^n}(X_{t_i} - X_{t_{i-1}})\}_{i\le n}$ is necessarily a random sample of uniform random 
variables and, hence, $Z^0_n$ of (4.2) is indeed the standardized empirical process of a uniform 
random sample. Also, note that 

$$Z^0_n(F_{\delta^n}(x)) = n^{1/2}\{F^n(x) - F_{\delta^n}(x)\} \quad \forall x \in \mathbb{R},$$

where $F^n$ is the empirical process of $\{X_{t_i} - X_{t_{i-1}} : i = 1,\ldots,n\}$. The following 
transformation will be useful in the sequel: 

$$\mathcal{L}(x;m,\kappa,H) := \kappa\sum_{i=1}^{m}\sum_{j=0}^{k}\varphi_{i,j}(x)\Big\{\varphi_{i,j}(x_i)\big(H(x_i) - H(x_{i-1})\big) - \int_{x_{i-1}}^{x_i}\varphi'_{i,j}(u)\big(H(u) - H(x_{i-1})\big)\,du\Big\},$$

where $\varphi_{i,j}$ is the basis element in (1.4) and $H : \mathbb{R} \to \mathbb{R}$ is a locally 
integrable function. Note that if $H$ is a function of bounded variation, then 

$$\mathcal{L}(x;m,\kappa,H) = \kappa\sum_{i=1}^{m}\sum_{j=0}^{k}\varphi_{i,j}(x)\int_{x_{i-1}}^{x_i}\varphi_{i,j}(u)\,dH(u).$$

The following estimate follows easily from (1.5): 

$$\sup_{x\in[a,b]}|\mathcal{L}(x;m,\kappa,H)| \le K\cdot\kappa\cdot m\cdot\omega\Big(H;[a,b],\frac{b-a}{m}\Big), \qquad \text{(C.1)}$$

where $K$ is a constant (depending only on $k$) and $\omega$ is the modulus of continuity of $H$ 
defined by 

$$\omega(H;[a,b],\delta) = \sup\{|H(u) - H(v)| : u, v \in [a,b],\ |u-v| < \delta\}.$$
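To make the modulus of continuity concrete, the following sketch approximates $\omega(H;[a,b],\delta)$ on a uniform grid. It is only an illustration: the grid size `n_grid` and the function name are hypothetical, and the grid version compares points at distance at most $\delta$, so it can slightly underestimate the supremum taken over all pairs.

```python
import math

def modulus_of_continuity(h, a, b, delta, n_grid=400):
    """Grid approximation of sup{|h(u) - h(v)| : u, v in [a, b], |u - v| <= delta}."""
    step = (b - a) / n_grid
    values = [h(a + i * step) for i in range(n_grid + 1)]
    max_lag = int(delta / step)  # largest index gap whose distance does not exceed delta
    best = 0.0
    for i in range(n_grid + 1):
        for j in range(i + 1, min(i + max_lag, n_grid) + 1):
            best = max(best, abs(values[j] - values[i]))
    return best

if __name__ == "__main__":
    # For a Lipschitz function the modulus scales linearly in delta.
    print(modulus_of_continuity(lambda u: u, 0.0, 1.0, 0.1))
    # The crude bound omega <= 2 * sup|H| always holds.
    print(modulus_of_continuity(math.sin, 0.0, 3.0, 0.5))
```

The second call illustrates the elementary bound $\omega(H;[a,b],\delta) \le 2\|H\|_{[a,b]}$, which is the only property of $\omega$ needed when $H$ is merely bounded.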
Let us write the estimator (3.2) in terms of $\mathcal{L}$ as follows: 

$$\hat{s}^n_T(x) := \sum_{i=1}^{m}\sum_{j=0}^{k}\hat{\beta}(\varphi_{i,j})\,\varphi_{i,j}(x) = \mathcal{L}\Big(x;m,\frac{n}{T},F^n(\cdot)\Big). \qquad \text{(C.2)}$$

Note that $\mathbb{E}\hat{s}^n_T(x)$ admits a similar expression with $F^n$ replaced by 
$F_{\delta^n}$. Thus, it follows that a.s. 

$$Y^n_T(x) := \hat{s}^n_T(x) - \mathbb{E}\hat{s}^n_T(x) = \mathcal{L}\big(x;m,n^{1/2}T^{-1},Z^0_n(F_{\delta^n}(\cdot))\big) \qquad \text{(C.3)}$$

for all $x$. As was explained in Section 4, one of the key ideas of the approach of Bickel and 
Rosenblatt [3] consists of approximating $Z^0_n$ by a Brownian bridge $Z^0$. To this end, we 
use the following result, which follows from the Komlós, Major and Tusnády construction 
[19]. 

Theorem C.1. There exists a probability space $(\Omega,\mathcal{F},\mathbb{P})$, equipped with a 
standard Brownian motion $Z$, on which one can construct a version $\widetilde{Z}^0_n$ of $Z^0_n$ 
such that 

$$\|\widetilde{Z}^0_n - Z^0\|_{[0,1]} = O_p(n^{-1/2}\log n),$$

where $Z^0(x) := Z(x) - xZ(1)$ is the corresponding Brownian bridge. 
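The Brownian bridge in the statement above is obtained from the Brownian motion by the pathwise map $Z^0(x) = Z(x) - xZ(1)$. The sketch below applies that map to a discretized path; the grid size, the seeded generator and the function names are illustrative assumptions, and the simulation plays no role in the proof.

```python
import math
import random

def brownian_path(n_steps, rng):
    """Standard Brownian motion sampled at i / n_steps, i = 0..n_steps."""
    dt = 1.0 / n_steps
    path = [0.0]
    for _ in range(n_steps):
        path.append(path[-1] + rng.gauss(0.0, math.sqrt(dt)))
    return path

def bridge_from_motion(path):
    """Apply Z0(x) = Z(x) - x * Z(1) on the grid x = i / n_steps."""
    n_steps = len(path) - 1
    return [z - (i / n_steps) * path[-1] for i, z in enumerate(path)]

if __name__ == "__main__":
    rng = random.Random(0)
    bridge = bridge_from_motion(brownian_path(1000, rng))
    print(bridge[0], bridge[-1])  # the bridge is pinned at both endpoints
```

A quick sanity check is that the bridge vanishes at $x = 0$ and $x = 1$ and that its variance at $x$ is $x(1-x)$, for instance $1/4$ at the midpoint.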
Since we are looking for the asymptotic distribution of $\sup_x|Y^n_T(x)|$, properly scaled 
and centered, we can work with the process $\widetilde{Z}^0_n$ instead of $Z^0_n$. Thus, with some 
abuse of notation, we drop the tilde in all of the processes of Theorem C.1. The following is 
an easy estimate. Again abusing notation, the process $Y^n_T$ in the following lemma is 
actually the process resulting from replacing $Z^0_n(F_{\delta^n}(\cdot))$ in (C.3) by 
$\widetilde{Z}^0_n(F_{\delta^n}(\cdot))$. 

Lemma C.2. Let ${}_0Y^n_T(x) := \mathcal{L}\big(x;m,n^{1/2}T^{-1},Z^0(F_{\delta^n}(\cdot))\big)$. It 
then follows that $\|{}_0Y^n_T - Y^n_T\|_{[a,b]} = O_p(m\log n/T)$ as $n \to \infty$. 

Proof. Clearly, $\omega(H;[a,b],\delta) \le 2\|H\|_{[a,b]}$ for any process $H$. Thus, we get the 
result from (C.1) and Theorem C.1. □ 

As in [3], our approach is to devise successive approximations of ${}_0Y^n_T(x)$, denoted by 
${}_1Y^n_T,\ldots,{}_NY^n_T$, such that the asymptotic distribution of the supremum 
$\sup_{x\in[a,b]}|{}_NY^n_T(x)|$, properly centered and scaled by certain constants $b_n$ and $a_n$, 
is easy to determine and such that the error of the successive approximations is negligible 
when multiplied by $a_n$. We proceed to carry out this program. 

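First, note that since a Brownian bridge satisfies 
$\{Z^0(x)\}_{x\le 1} \overset{\mathcal{D}}{=} \{Z^0(1-x)\}_{x\le 1}$, we have 

$$\{{}_0Y^n_T(x)\}_{x\in[a,b]} \overset{\mathcal{D}}{=} \{{}_1Y^n_T(x)\}_{x\in[a,b]},$$

where ${}_1Y^n_T(x) := \mathcal{L}\big(x;m,n^{1/2}T^{-1},Z^0(\bar{F}_{\delta^n}(\cdot))\big)$ and 
$\bar{F} := 1 - F$. The following is our first estimate. 

Lemma C.3. Suppose that the assumptions of Proposition 2.1 are satisfied. There exist 
constants $K < \infty$ and $t_0 > 0$ such that if $T/n < t_0$, then 

$${}_2Y^n_T(x) := \mathcal{L}\big(x;m,n^{1/2}T^{-1},Z(\bar{F}_{\delta^n}(\cdot))\big)$$

is such that 

$$\|{}_1Y^n_T - {}_2Y^n_T\|_{[a,b]} \le K\Big(\frac{mT}{n^{3/2}} + \frac{1}{n^{1/2}}\Big)|Z(1)|.$$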



Proof. Clearly, 

$${}_2Y^n_T(x) - {}_1Y^n_T(x) = \mathcal{L}\big(x;m,n^{1/2}T^{-1},Z(1)\bar{F}_{\delta^n}(\cdot)\big).$$

Thus, by (C.1), 

$$\|{}_1Y^n_T - {}_2Y^n_T\|_{[a,b]} \le K\,\frac{mn^{1/2}}{T}\,\omega(\bar{F}_{\delta^n};[a,b],d_m)\,|Z(1)|,$$

where $d_m := (b-a)/m$. In view of Proposition 2.1, for $n$ and $T$ such that $T/n < t_0$, there 
are constants $k$ and $k'$ such that 

$$|\bar{F}_{\delta^n}(u) - \bar{F}_{\delta^n}(v)| \le 2k(\delta^n)^2 + 2k'\delta^n m^{-1},$$

provided that $u, v \in [a,b]$ and $|v - u| < d_m$. □ 



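Let us now work with ${}_2Y^n_T$. Because of the self-similarity of the Brownian motion, we 
have that 

$$\{{}_2Y^n_T(x)\}_{x\in[a,b]} \overset{\mathcal{D}}{=} \{{}_3Y^n_T(x)\}_{x\in[a,b]},$$

where 

$${}_3Y^n_T(x) := \mathcal{L}\Big(x;m,T^{-1/2},Z\Big(\frac{1}{\delta^n}\bar{F}_{\delta^n}(\cdot)\Big)\Big).$$

The following estimate results from Lévy's modulus of continuity theorem. 

Lemma C.4. Let ${}_4Y^n_T(x) := \mathcal{L}\big(x;m,T^{-1/2},Z\big(\int_{\cdot}^{\infty}s(u)\,du\big)\big)$. 
If $T_n$ is such that $T_n/n \to 0$, then, for $n$ large enough, 

$$\|{}_3Y^n_T - {}_4Y^n_T\|_{[a,b]} \le K\,m\,n^{-1/2}\log^{1/2}\Big(\frac{n}{T}\Big)$$

for a constant $K < \infty$. 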

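Proof. It is not hard to see that there exists a constant $K$ such that 

$$\|{}_3Y^n_T - {}_4Y^n_T\|_{[a,b]} \le K\,T^{-1/2}\,m\sup_{x\in[a,b]}\Big|Z\Big(\frac{1}{\delta^n}\bar{F}_{\delta^n}(x)\Big) - Z\Big(\int_x^{\infty}s(u)\,du\Big)\Big|.$$

By Proposition 2.1, there exist constants $k > 0$ and $t_0 > 0$ such that for all $0 < \delta < t_0$, 

$$\sup_{x\in[a,b]}\Big|\frac{1}{\delta}\bar{F}_{\delta}(x) - \int_x^{\infty}s(u)\,du\Big| \le k\delta.$$

Thus, there exists a constant $K' > 0$ such that, for large enough $n$, 

$$\sup_{x\in[a,b]}\Big|Z\Big(\frac{1}{\delta^n}\bar{F}_{\delta^n}(x)\Big) - Z\Big(\int_x^{\infty}s(u)\,du\Big)\Big| \le K'\Big(\frac{T}{n}\log\frac{n}{T}\Big)^{1/2}. \qquad \text{(C.4)}$$

□ 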



We now note that 
and, hence, 



s(u) du 



3^/^{u)dZ{u) 
where 

Using integration by parts, one can simplify ^Y^^x) as follows: 

rn k .Xi 
1=0 j=0 "'^'-1 

The following is the last estimate. 
Lemma C.5. Suppose that the Assumptions 1 in Section 4 hold true. Let 

1=0 j=0 "'^'-1 

There then exists a random variable M such that 

- (5- a)i/25-i/2(.)^y«(.)|, < MT-i/2. 

Proof. 

Let $q(x) := s^{1/2}(x)$ and $c := (b-a)^{1/2}$. Using integration by parts, we have 



= q^'^{x){if>i^j{xi)[q{xi) - q{x))Z[x.i) - ip.ij{xi-i){q[xi-i) - q[x))Z{xi-i)} 
/ Wi,j{u){q{u)-q{x))-if,^j{u)q'{u)}Z{u)du. 



Since $q^{-1}(\cdot)$ and $q'(\cdot)$ are bounded on $[a,b]$, there exists a constant $K$ such that 
$\sup_{x}|H_{i,j}(x)| \le K m^{-1/2}\sup_{u\in[a,b]}|Z(u)|$. 

Thus, 

rp \ —1/2 m k 



<XT-i/2 sup \Z{u)\. 

ue[a,b] L-l 
The latter approximation, ${}_6Y^n_T$, is simple enough to try determining its asymptotic 
distribution (appropriately centered and scaled). Indeed, 

$$M(T,n,m) := \sup_{x\in[a,b]}|{}_6Y^n_T(x)| \overset{\mathcal{D}}{=} T^{-1/2}m^{1/2}\max_{1\le j\le m}\{\zeta^{(k)}_j\}, \qquad \text{(C.5)}$$

where $\{\zeta^{(k)}_j\}_j$ are independent copies of the r.v. $\zeta^{(k)}$ defined in (4.3). The 
following result obtains the asymptotic distributions of 
$M_m := \max_{1\le j\le m}\{\zeta^{(k)}_j\}$ for the cases $k = 0$ and $k = 1$. 

Lemma C.6. Let $a_m$ and $b_m$ be as in (4.5)-(4.6). The following limits then hold: 

$$\lim_{m\to\infty}\mathbb{P}\Big(\max_{1\le j\le m}\{\zeta^{(0)}_j\} \le \frac{y}{a_m} + b_m\Big) = e^{-2e^{-y}}, \qquad \text{(C.6)}$$

$$\lim_{m\to\infty}\mathbb{P}\Big(2^{-1}\max_{1\le j\le m}\{\zeta^{(1)}_j\} \le \frac{y}{a_m} + b_m\Big) = e^{-4e^{-y}} \qquad \text{(C.7)}$$

for all $y \in \mathbb{R}_+$. 

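Proof. The limit (C.6) follows from the well-known identity 

$$\lim_{m\to\infty}m\big(1 - \Phi(u_m(y))\big) = e^{-y}, \qquad \text{(C.8)}$$

where $\Phi$ is the standard normal distribution function and $u_m(y) := y/a_m + b_m$. Indeed, for 
large enough $m$, the probability in (C.6) can be written as follows: 

$$\big(2\Phi(u_m(y)) - 1\big)^m = \Big(1 - \frac{1}{m}\,2m\big(1 - \Phi(u_m(y))\big)\Big)^m \xrightarrow[m\to\infty]{} e^{-2e^{-y}}.$$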
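The identity (C.8) and the resulting limit (C.6) can be inspected numerically. The constants $a_m$ and $b_m$ of (4.5)-(4.6) are not reproduced in this appendix, so the sketch below assumes the classical Gaussian normalizing constants $a_m = \sqrt{2\log m}$ and $b_m = \sqrt{2\log m} - (\log\log m + \log 4\pi)/(2\sqrt{2\log m})$; this choice, like the function names, is an assumption made for illustration.

```python
import math

def normal_tail(u):
    """1 - Phi(u) for the standard normal distribution, via the complementary error function."""
    return 0.5 * math.erfc(u / math.sqrt(2.0))

def gumbel_constants(m):
    """Classical normalizing constants for the maximum of m standard normals (assumed here)."""
    root = math.sqrt(2.0 * math.log(m))
    a_m = root
    b_m = root - (math.log(math.log(m)) + math.log(4.0 * math.pi)) / (2.0 * root)
    return a_m, b_m

def scaled_tail(m, y):
    """m * (1 - Phi(u_m(y))), which should approach exp(-y) as m grows."""
    a_m, b_m = gumbel_constants(m)
    return m * normal_tail(y / a_m + b_m)

def abs_max_cdf(m, y):
    """(2 * Phi(u_m(y)) - 1)^m, the probability appearing in (C.6)."""
    a_m, b_m = gumbel_constants(m)
    return (1.0 - 2.0 * normal_tail(y / a_m + b_m)) ** m

if __name__ == "__main__":
    for m in (10**3, 10**6, 10**9):
        print(m, scaled_tail(m, 1.0), math.exp(-1.0),
              abs_max_cdf(m, 1.0), math.exp(-2.0 * math.exp(-1.0)))
```

The convergence is known to be slow (logarithmic in $m$), so the printed values approach their limits only gradually.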

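To handle the case $k = 1$, we embed the problem into the theory of multivariate extreme 
values (see, e.g., [16]). Consider independent copies $\{\mathbf{V}_i\}_i$ of the following vector of 
jointly standard Gaussian variables: 

$$\mathbf{V} := \Big(\tfrac{1}{2}Z_0 + \tfrac{\sqrt{3}}{2}Z_1,\ \tfrac{1}{2}Z_0 - \tfrac{\sqrt{3}}{2}Z_1\Big)'. \qquad \text{(C.9)}$$

Since $\zeta^{(1)} = |Z_0| + \sqrt{3}|Z_1|$, we can see that 

$$\Big\{2^{-1}\max_{1\le j\le m}\{\zeta^{(1)}_j\} \le \frac{y}{a_m} + b_m\Big\} = \Big\{\max_{1\le i\le m}\mathbf{V}_i \le \mathbf{a}_m^{-1}\mathbf{y} + \mathbf{b}_m,\ \min_{1\le i\le m}\mathbf{V}_i \ge -\mathbf{a}_m^{-1}\mathbf{y} - \mathbf{b}_m\Big\},$$

where $\mathbf{y} := (y,y)'$, $\mathbf{b}_m := (b_m,b_m)'$, $\mathbf{a}_m := (a_m,a_m)'$ and all 
operations are pointwise. 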
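The reduction to the vector $\mathbf{V}$ rests on the elementary identity $\tfrac{1}{2}(|z_0| + \sqrt{3}|z_1|) = \max(|v_1|, |v_2|)$, where $(v_1, v_2)$ are the components in (C.9). A short numerical check of this identity is sketched below; the function names are hypothetical and the random draws only serve as test points.

```python
import math
import random

def zeta_1(z0, z1):
    """zeta^(1) = |Z_0| + sqrt(3) * |Z_1|."""
    return abs(z0) + math.sqrt(3.0) * abs(z1)

def v_vector(z0, z1):
    """Components of V in (C.9); each has unit variance and their covariance is -1/2."""
    half_root3 = math.sqrt(3.0) / 2.0
    return (0.5 * z0 + half_root3 * z1, 0.5 * z0 - half_root3 * z1)

if __name__ == "__main__":
    rng = random.Random(1)
    worst = 0.0
    for _ in range(1000):
        z0, z1 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        v1, v2 = v_vector(z0, z1)
        worst = max(worst, abs(0.5 * zeta_1(z0, z1) - max(abs(v1), abs(v2))))
    print(worst)  # discrepancy is at the level of rounding error
```

Consequently, the event $\{2^{-1}\zeta^{(1)} \le u\}$ coincides with $\{-u \le V_1 \le u,\ -u \le V_2 \le u\}$, which is what allows the maximum of the $\zeta^{(1)}_j$ to be expressed through componentwise maxima and minima of the $\mathbf{V}_i$.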
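Then, (C.7) will follow from the following identity: 

$$\lim_{m\to\infty}\mathbb{P}\Big(\max_{1\le i\le m}\mathbf{V}_i \le \mathbf{a}_m^{-1}\mathbf{y} + \mathbf{b}_m,\ \min_{1\le i\le m}\mathbf{V}_i \ge -\mathbf{a}_m^{-1}\mathbf{z} - \mathbf{b}_m\Big) = \exp\{-e^{-y_1} - e^{-y_2} - e^{-z_1} - e^{-z_2}\} \qquad \text{(C.10)}$$

for any $\mathbf{y} = (y_1,y_2)'$ and $\mathbf{z} = (z_1,z_2)'$. To show (C.10), first note that the 
probability therein can be written as 

$$A_n := \big\{\mathbb{P}\big(-u_n(z_1) \le V_1 \le u_n(y_1),\ -u_n(z_2) \le V_2 \le u_n(y_2)\big)\big\}^n,$$

where $\mathbf{V} := (V_1,V_2)'$ is defined in (C.9) and $u_n(x) := x/a_n + b_n$. Let 

$$\bar{F}_n(y,z;X,Y) := \mathbb{P}(X > u_n(y),\,Y > u_n(z)), \qquad \bar{F}_n(y;X) := \mathbb{P}(X > u_n(y)),$$

where $X$ and $Y$ represent random variables. We recall the following results valid for any 
jointly normal variables $X$ and $Y$ and arbitrary $y$ and $z$ (see [16], Example 5.3.1): 

$$\lim_{n\to\infty}n\bar{F}_n(y,z;X,Y) = 0, \qquad \lim_{n\to\infty}n\bar{F}_n(y;X) = e^{-y}.$$

Then, (C.10) follows once we note that $A_n^{1/n}$ can be written as follows: 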

4V" = 1 - -{nFn{zi;Vi) + nF„{zr, V2) + nF„iyi;-Vi) + nF„{y2;~V2) 
n 

- nFn{zi,Z2;Vi,V2) ~ nFniyi,Z2-,-Vi,V2) - n'Fnizi,y2-,Vi,-V2)}. □ 
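In view of (C.5), the following are easy consequences of the above lemma: 

$$\lim_{n\to\infty}\mathbb{P}\Big(T_n^{1/2}m_n^{-1/2}\sup_{x\in[a,b]}|{}_6Y^n_{T_n}(x)| \le \frac{y}{a_{m_n}} + b_{m_n}\Big) = e^{-2e^{-y}}, \qquad \text{(C.11)}$$

$$\lim_{n\to\infty}\mathbb{P}\Big(2^{-1}T_n^{1/2}m_n^{-1/2}\sup_{x\in[a,b]}|{}_6Y^n_{T_n}(x)| \le \frac{y}{a_{m_n}} + b_{m_n}\Big) = e^{-4e^{-y}}, \qquad \text{(C.12)}$$

valid for all $y \in \mathbb{R}_+$, $T_n > 0$ and $m_n$ such that $m_n \to \infty$. We are now ready 
to prove the main theorem of Section 4: 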

Proof of Theorem 4.1. The idea is to use the following simple observations. Let 
$\mathcal{L}_n$ be a functional on $D[a,b]$ such that 

$$|\mathcal{L}_n(\omega_1) - \mathcal{L}_n(\omega_2)| \le M_n\|\omega_1 - \omega_2\| \qquad \text{(C.13)}$$

and let $A_n, B_n$ be processes with values on $D[a,b]$ such that $\|A_n - B_n\| = o_p(1/M_n)$. 
Then, if $\mathcal{L}_n(A_n)$ converges in distribution to $F$, $\mathcal{L}_n(B_n)$ will also 
converge to $F$. Throughout this proof, 
£„(w) :=a„i,J K • - • • sup |s"^/^(a;)w(a;)| - 6„ 

which satisfies the Lipschitz condition (C.13) with il/„ ~ ^am,Tn^'^ /m]-!"^ . From Lemma 
C.5, in order for (C.12) to hold with 6^t1 replaced by 5^7?^, it suffices that 

hm — 77?"™r.Jn = 

which is obvious since $m_n \to \infty$. Since ${}_4Y^n_{T_n}$ has the same law as 
${}_5Y^n_{T_n}$, (C.12) also holds for ${}_4Y^n_{T_n}$. In the light of Lemma C.4, (C.12) will hold 
for ${}_3Y^n_{T_n}$ (and, hence, for ${}_2Y^n_{T_n}$ as well) since 

$$\lim_{n\to\infty}\frac{T_n^{1/2}}{m_n^{1/2}}\,a_{m_n}m_n n^{-1/2}\log^{1/2}\Big(\frac{n}{T_n}\Big) = c\lim_{n\to\infty}\Big(m_n\log m_n\cdot\frac{T_n}{n}\log\frac{n}{T_n}\Big)^{1/2} = 0,$$

which follows from condition (ii) in the statement of Theorem 4.1. Similarly, in view of 
Lemma C.3, (C.12) will hold for ${}_1Y^n_{T_n}$ (and hence, for ${}_0Y^n_{T_n}$ as well) since 

Indeed, the above expression is upper bounded by ( Y^"^ , which converges to 

$0$ because of assumption (i) and the fact that $m_n \to \infty$. Finally, in the light of Lemma 
C.2, in order for (C.12) to hold for $Y^n_{T_n}$, it suffices that 

$$\lim_{n\to\infty}\frac{T_n^{1/2}}{m_n^{1/2}}\,a_{m_n}\frac{m_n}{T_n}\log n = 0,$$

which follows from assumption (ii) in the statement of Theorem 4.1. □ 

Proof of Corollary 4.2. Using the same reasoning as in the proof of Theorem 3.1, it 
turns out that 

sup |EsJ (x)-s(a;)| < aY^^^Vto-" 
xe[a,b] " \ n 

for an absolute constant K. As in the proof of Theorem 4.1, to show (4.9), it suffices that 

hm — r7ja,„„ V m„ =0, 

"^°°TO„/-^ V " / 

which holds in light of assumption (iii) in the statement of Corollary 4.2. □ 